9 research outputs found

    Measuring the similarity of two cohorts in the n-dimensional space

    Measuring the similarity of the case and control groups in clinical studies has always been an important but difficult task. Several statistical methods exist for this purpose, but most of them rely on dimension reduction or estimation, so they are not adequate in certain cases. In this paper, we propose three dissimilarity-based measures capable of evaluating case-control group pairs without the loss of valuable information.

    Managing polyglot systems metadata with hypergraphs

    A single type of data store can hardly fulfill every end-user requirement in the NoSQL world. Therefore, polyglot systems use different types of NoSQL data stores in combination. However, the heterogeneity of the data storage models makes managing the metadata a complex task in such systems, and only a handful of studies have addressed it. In this paper, we propose a hypergraph-based approach for representing the catalog of metadata in a polyglot system. Taking an existing common programming interface to NoSQL systems, we extend and formalize it as hypergraphs for managing metadata. Then, we define design constraints and query transformation rules for three representative data store types. Furthermore, we propose a simple query rewriting algorithm using the catalog itself for these data store types and provide a prototype implementation. Finally, we show the feasibility of our approach on a use case of an existing polyglot system. Peer reviewed. Postprint (author's final draft).

    A doxorubicinkezeléshez kapcsolódó szívelégtelenség kialakulásának rizikótényezői a hazai országos adatbázisok integrált, retrospektív elemzése alapján [Analysing the risk factors of doxorubicin-associated heart failure by a retrospective study of integrated, nation-wide databases]

    Introduction: The development of heart failure associated with anthracycline treatment is significantly influenced by the cumulative dose administered. According to previously published data, the risk of heart failure with doxorubicin is low at cumulative doses below 450 mg/m². Since this dose is generally not reached during doxorubicin therapy in current practice, other factors play a significant role in triggering treatment-related heart failure. Aim: Our aim was to explore in detail the risk factors of heart failure associated with doxorubicin treatment as applied in current practice. Method: Using data from the Hungarian health care financing databases and the National Cancer Registry, we performed a retrospective analysis that included patients whose breast cancer was confirmed by histological examination between 2004 and 2015. We analysed only patients whose medical history contained no chemotherapy and no data indicating heart failure before the tumour was confirmed. The heart failure endpoint was defined as the appearance of the I50 diagnosis code in the inpatient or autopsy documentation. Statistical analysis: Factors influencing the odds of developing heart failure were identified using multivariable binary logistic regression. In addition to comorbidities and demographic data, the oncological stage and the cumulative doses of oncological treatments were also taken into account in the analysis. Results: Among the 3288 patients treated with doxorubicin, the cumulative incidence of the heart failure endpoint was 6.2%. The occurrence of heart failure increased at cumulative doxorubicin doses above 400 mg/m². The risk also rose substantially with advancing age; a significant increase in risk was already observable above 50 years of age. In addition, diabetes, pyrimidine analogues, carboplatin (a platinum-based agent) and bevacizumab were associated with a higher risk. Conclusion: Through the integrated analysis of the Hungarian financing databases and the Cancer Registry database, the risk factors of heart failure associated with doxorubicin treatment as applied in current practice could be identified at the population level.

    Weighted nearest neighbours-based control group selection method for observational studies.

    Although propensity score matching is the most widely used balancing method in observational studies, it has received much criticism. The main drawback of this method is that the individuals of the case and control groups are paired in the compressed one-dimensional space of propensity scores. In this paper, a novel multivariate weighted k-nearest neighbours-based control group selection method is proposed that can eliminate this disadvantage of propensity score matching. The proposed method pairs the elements of the case and control groups in the original vector space of the covariates, and the dissimilarities of the individuals are calculated as the weighted distances of the subjects. The weight factors are calculated from a logistic regression model fitted on the status of treatment assignment. The efficiency of the proposed method was evaluated by Monte Carlo simulations on different datasets. Experimental results show that the proposed Weighted Nearest Neighbours Control Group Selection with Error Minimization method is able to select a more balanced control group than the most widely applied greedy form of the propensity score matching method, especially for individuals characterized by few descriptive features.
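
The idea of pairing individuals by weighted distance in the original covariate space can be illustrated with a minimal sketch. It is a plain greedy nearest-neighbour pairing, not the error-minimization method the abstract describes. The function names, the example covariates, and the weights are all hypothetical; in the abstract the weights come from a fitted logistic regression model, whereas here they are simply hard-coded.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two covariate vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def greedy_weighted_match(cases, candidates, weights):
    """Pair each case with its nearest unused candidate (1:1, without replacement).

    Returns a list of (case_index, candidate_index) pairs.
    """
    if len(candidates) < len(cases):
        raise ValueError("need at least as many candidates as cases")
    unused = set(range(len(candidates)))
    pairs = []
    for i, case in enumerate(cases):
        # Ties are broken by the lower candidate index.
        j = min(unused, key=lambda k: (weighted_distance(case, candidates[k], weights), k))
        pairs.append((i, j))
        unused.remove(j)
    return pairs

# Hypothetical covariates: (age in decades, diabetes flag).
cases = [(5.0, 1), (6.2, 0)]
candidates = [(5.1, 0), (6.0, 0), (5.3, 1), (4.0, 1)]
# Assumed weights; the abstract derives them from a logistic regression model.
weights = (1.0, 4.0)

pairs = greedy_weighted_match(cases, candidates, weights)
print(pairs)
```

Because the weight on the second covariate is larger, a candidate that differs in the diabetes flag is penalized more heavily than one that differs slightly in age.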

    Optimized Weighted Nearest Neighbours Matching Algorithm for Control Group Selection

    An essential criterion for the proper implementation of case-control studies is selecting appropriate case and control groups. In this article, a new simulated annealing-based control group selection method is proposed, which treats the selection of individuals for the control group as a distance optimization task. The proposed algorithm pairs the individuals in the n-dimensional feature space by minimizing the weighted distances between them. The weights of the dimensions are based on the odds ratios calculated from the logistic regression model fitted on the variables describing the probability of membership of the treated group. Simulated annealing is used to find the optimal pairing of the individuals. The effectiveness of the newly proposed Weighted Nearest Neighbours Control Group Selection with Simulated Annealing (WNNSA) algorithm is demonstrated in two Monte Carlo studies. Results show that the WNNSA method can outperform the widely applied greedy propensity score matching method in feature spaces where only a few covariates characterize individuals and the covariates can only take a few values.
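
Treating control selection as a distance optimization task solved by simulated annealing might look roughly like the following sketch. It is not the WNNSA algorithm itself: the move set (swapping one matched control with any other control), the geometric cooling schedule, the default parameters, and the distance matrix are all assumptions made for illustration.

```python
import math
import random

def total_cost(assignment, dist):
    """Sum of distances, where assignment[i] is the control matched to case i."""
    return sum(dist[i][j] for i, j in enumerate(assignment))

def anneal_matching(dist, steps=5000, t_start=1.0, t_end=1e-3, seed=0):
    """Search for a low-cost 1:1 matching with simulated annealing.

    dist[i][j] is the (weighted) distance between case i and control j;
    there must be at least as many controls as cases.
    """
    rng = random.Random(seed)
    n_cases, n_controls = len(dist), len(dist[0])
    # perm is a permutation of the controls; its first n_cases entries are matched.
    perm = list(range(n_controls))
    cost = total_cost(perm[:n_cases], dist)
    best, best_cost = perm[:n_cases], cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i = rng.randrange(n_cases)
        k = rng.randrange(n_controls)
        if i == k:
            continue
        perm[i], perm[k] = perm[k], perm[i]
        new_cost = total_cost(perm[:n_cases], dist)
        delta = new_cost - cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = perm[:n_cases], cost
        else:
            perm[i], perm[k] = perm[k], perm[i]  # undo the rejected move
    return best, best_cost

# Hypothetical distances between 2 cases (rows) and 3 candidate controls (columns).
dist = [[1.0, 0.2, 0.9],
        [0.1, 0.3, 0.8]]
best, best_cost = anneal_matching(dist)
print(best, best_cost)
```

The sketch keeps the best matching seen so far, so the returned cost is never worse than that of the starting assignment, even if the walk ends in a worse state.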

    Prediction for Future Yaw Rate Values of Vehicles Using Long Short-Term Memory Network

    Currently, electric mobility and autonomous vehicles are of top priority from safety, environmental and economic points of view. In the automotive industry, monitoring and processing accurate and plausible sensor signals is a safety-critical task. The vehicle's yaw rate is one of the most important state descriptors of vehicle dynamics, and its prediction can significantly contribute to choosing the correct intervention strategy. In this article, a Long Short-Term Memory network-based neural network model is proposed for predicting the future values of the yaw rate. The training, validation and testing of the neural network were conducted on experimental data gathered from three different driving scenarios. The proposed model can predict the yaw rate value 0.2 s into the future with high accuracy, using the vehicle's sensor signals from the preceding 0.3 s. The R² values of the proposed network range between 0.8938 and 0.9719 in the different scenarios, and in a mixed driving scenario the value is 0.9624.
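
The input and target layout implied by "0.3 s of history, 0.2 s ahead" can be sketched as a sliding-window construction, together with the usual definition of R². The sampling period of 0.01 s is an assumption (the abstract does not state one), the function names are hypothetical, and a single scalar sequence stands in for the several sensor signals the model actually uses. The network itself is not shown.

```python
def make_windows(signal, history, horizon):
    """Build (input window, target) pairs from a 1-D sequence.

    Each input holds `history` consecutive samples; the target is the
    sample `horizon` steps after the last input sample.
    """
    pairs = []
    last_start = len(signal) - history - horizon
    for start in range(last_start + 1):
        window = signal[start:start + history]
        target = signal[start + history - 1 + horizon]
        pairs.append((window, target))
    return pairs

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for a constant target")
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot

SAMPLE_PERIOD_S = 0.01                  # assumed sampling period, not from the abstract
history = round(0.3 / SAMPLE_PERIOD_S)  # samples covering the last 0.3 s
horizon = round(0.2 / SAMPLE_PERIOD_S)  # samples until the 0.2 s prediction point

signal = list(range(100))               # placeholder sequence, not real yaw-rate data
pairs = make_windows(signal, history, horizon)
print(len(pairs), pairs[0][1])
```

Each pair would then serve as one training example: the window is the sequence fed to the recurrent model and the target is the value it should predict.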

    Aggregated Rankings of Top Leagues’ Football Teams: Application and Comparison of Different Ranking Methods

    In this study, the effectiveness and characteristics of three ranking methods were investigated based on their performance in ranking European football teams. The investigated methods were the Thurstone method with ties, the analytic hierarchy process with the logarithmic least squares method, and the RankNet neural network. The methods were analyzed in both complete and incomplete comparison tasks. The ranking based on complete comparison was performed on match results of national leagues, where each team had match results against all the other teams. In the incomplete comparison case, in addition to the national league results, only a few match results from international cups were available to determine the aggregated ranking of the teams playing in the top five European leagues. The rankings produced by the ranking methods were compared with each other, with the official national rankings, and with the UEFA club coefficient rankings. In addition, the correlation between the aggregated rankings and the Transfermarkt financial ranking was examined as a point of interest.
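
For the complete comparison case, the logarithmic least squares solution of a pairwise comparison matrix is the normalized row geometric mean, which the following sketch computes. The matrix here is built from hypothetical team strengths rather than real match results, and how the study converts match results into comparison ratios is not given in the abstract. The incomplete case requires a different formulation and is not covered.

```python
import math

def llsm_weights(matrix):
    """Priority weights from a complete, positive pairwise comparison matrix.

    Uses the row geometric mean, which is the logarithmic least squares
    solution when every pair has been compared.
    """
    n = len(matrix)
    geo_means = [math.exp(sum(math.log(x) for x in row) / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Hypothetical strengths used to build a perfectly consistent matrix,
# where matrix[i][j] = strengths[i] / strengths[j].
strengths = [0.5, 0.3, 0.2]
matrix = [[a / b for b in strengths] for a in strengths]

weights = llsm_weights(matrix)
# Indices of the items ordered from strongest to weakest.
ranking = sorted(range(len(weights)), key=lambda i: -weights[i])
print(weights, ranking)
```

With a perfectly consistent matrix the method recovers the underlying strengths up to floating-point error, which makes this a convenient sanity check before applying it to noisy comparisons.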